(Feedforward)=
# Chapter 8 -- Feedforward
[<!-- module-dsml badge --><span class="module module-dsml">Data Science and Machine Learning for Geoscientists</span>](module-dsml) 


Let's take a look at how the feedforward pass is computed in a three-layer neural network.

<img src="images/feedForward.PNG" width="500">
Figure 8.1

From Figure 8.1 above, the inputs to the first and second neurons in the hidden layer are

$$
h_1^{(1)} = w_{11}^{(1)}*x_1 + w_{21}^{(1)}*x_2 + w_{31}^{(1)}*x_3+ w_{41}^{(1)}*1
$$ (eq8_1)

$$
h_2^{(1)} = w_{12}^{(1)}*x_1 + w_{22}^{(1)}*x_2 + w_{32}^{(1)}*x_3+ w_{42}^{(1)}*1
$$ (eq8_2)

where $w^{(1)}_{4m}$ is the bias of hidden neuron $m$, written as a weight on a constant input of 1.

To simplify the two equations above, we can write them in matrix form

$$
H^{(1)} = [h_1^{(1)} \;\; h_2^{(1)}] = [x_1 \;\; x_2 \;\; x_3 \;\; 1]
\begin{bmatrix}
w^{(1)}_{11} & w^{(1)}_{12} \\
w^{(1)}_{21} & w^{(1)}_{22} \\
w^{(1)}_{31} & w^{(1)}_{32} \\
w^{(1)}_{41} & w^{(1)}_{42}
\end{bmatrix}
$$ (eq8_3)
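As a rough numerical check of this matrix form, the sketch below evaluates the product with NumPy and compares it with the two scalar sums. The input values and weights are made-up numbers chosen only for illustration; they are not taken from Figure 8.1.

```python
import numpy as np

# Hypothetical input (x1, x2, x3) with a constant 1 appended for the bias.
x = np.array([0.5, -1.0, 2.0, 1.0])

# Hypothetical 4x2 weight matrix W^(1); the last row holds the bias weights.
W1 = np.array([[0.10,  0.40],
               [0.20,  0.50],
               [0.30,  0.60],
               [0.05, -0.05]])

# Matrix form: a single product gives both hidden-layer inputs.
H1 = x @ W1

# Scalar form, written out term by term for each hidden neuron.
h1 = W1[0, 0]*x[0] + W1[1, 0]*x[1] + W1[2, 0]*x[2] + W1[3, 0]*1
h2 = W1[0, 1]*x[0] + W1[1, 1]*x[1] + W1[2, 1]*x[2] + W1[3, 1]*1

print(H1)      # matrix result
print(h1, h2)  # scalar results, which should match H1
```

Both forms should agree, which is the point of the matrix notation: one product replaces one hand-written sum per neuron.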

Applying the activation function $\sigma$ to each of these inputs gives the two outputs of the hidden layer

$$
\sigma(H^{(1)}) = [\sigma(h_1^{(1)}) \;\; \sigma( h_2^{(1)})]
$$ (eq8_4)

These in turn are the inputs to the next layer (the output layer), together with a constant 1 for the bias

$$
h^{(2)} = w^{(2)}_{11}* \sigma(h^{(1)}_1)+w^{(2)}_{21} *\sigma(h^{(1)}_2)+w^{(2)}_{31}*1 
$$ (eq8_5)

Again, we can simplify this equation by writing it in matrix form

$$
H^{(2)} = [h^{(2)}] = [\sigma(h_1^{(1)}) \;\;\sigma(h_2^{(1)}) \; \; 1]
\begin{bmatrix}
w^{(2)}_{11} \\
w^{(2)}_{21} \\
w^{(2)}_{31} 
\end{bmatrix}
$$ (eq8_6)

Then we pass this value $h^{(2)}$ through the sigmoid function in the output layer to obtain the prediction

$$
    \hat{y} = \sigma(h^{(2)})
$$ (eq8_7)
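The output-layer step can be sketched in the same way. The hidden-layer inputs and the weights `W2` below are hypothetical values used only to illustrate the shapes involved.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation, applied elementwise to NumPy arrays."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical hidden-layer inputs h_1^(1) and h_2^(1).
H1 = np.array([0.5, 0.85])

# Hypothetical 3x1 weight matrix W^(2); the last row is the bias weight.
W2 = np.array([[0.3],
               [-0.2],
               [0.1]])

# Hidden-layer outputs with a constant 1 appended: [σ(h1), σ(h2), 1].
a1 = np.append(sigmoid(H1), 1.0)

# Input to the output neuron, then the prediction.
h2_out = a1 @ W2          # array of shape (1,)
y_hat = sigmoid(h2_out)   # prediction, strictly between 0 and 1

print(h2_out, y_hat)
```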

Putting the equations for all three layers together, we have

$$
\hat{y} = \sigma\left(\left[\sigma\left([x_1 \;\; x_2 \;\; x_3 \;\; 1]
\begin{bmatrix}
w^{(1)}_{11} & w^{(1)}_{12} \\
w^{(1)}_{21} & w^{(1)}_{22} \\
w^{(1)}_{31} & w^{(1)}_{32} \\
w^{(1)}_{41} & w^{(1)}_{42}
\end{bmatrix}\right) \;\; 1\right]
\begin{bmatrix}
w^{(2)}_{11} \\
w^{(2)}_{21} \\
w^{(2)}_{31} 
\end{bmatrix}\right)
$$ (eq8_8)

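Or, treating $x$ and the hidden-layer output as augmented row vectors (each with a constant 1 appended for the bias), we can simplify it to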

$$
    \hat{y} = \sigma(\sigma(xW^{(1)})W^{(2)})
$$ (eq8_9)

This is the feedforward process: given the known weights $W$ and the input $x$, we calculate the prediction $\hat{y}$.
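A minimal sketch of the whole pass in this augmented row-vector form is given below. The helper name `feedforward_augmented` and the random weights are assumptions made for illustration only. The sketch also checks that folding the bias into the last row of each weight matrix gives the same result as adding the bias separately.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation, applied elementwise to NumPy arrays."""
    return 1.0 / (1.0 + np.exp(-z))

def feedforward_augmented(x, weight_matrices):
    """Hypothetical helper: x is a 1-D input without the trailing 1.

    Each weight matrix has one extra (last) row holding the bias weights,
    so a constant 1 is appended to the activations before every product.
    """
    a = np.asarray(x, dtype=float)
    for W in weight_matrices:
        a = sigmoid(np.append(a, 1.0) @ W)
    return a

# Illustrative random weights: 3 inputs -> 2 hidden neurons -> 1 output.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 2))   # 3 input weights + 1 bias row
W2 = rng.normal(size=(3, 1))   # 2 hidden weights + 1 bias row

x = np.array([0.5, -1.0, 2.0])
y_hat = feedforward_augmented(x, [W1, W2])

# Same computation with the bias rows split off and added explicitly.
hidden = sigmoid(x @ W1[:3] + W1[3])
y_check = sigmoid(hidden @ W2[:2] + W2[2])

print(y_hat, y_check)
```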

Finally, it's easy to write code computing the output from a Network instance. We begin by defining the sigmoid function:

In [4]:
import numpy as np

def sigmoid(z):
    """Sigmoid activation, 1 / (1 + exp(-z))."""
    return 1.0/(1.0+np.exp(-z))

Note that when the input `z` is a vector or NumPy array, NumPy automatically applies `sigmoid` elementwise, that is, in vectorized form.
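The elementwise behaviour can be checked with a short sketch that compares the vectorized call against an explicit Python loop. The test values in `z` are arbitrary.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation, applied elementwise to NumPy arrays."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 2.0])

# One call on the whole array ...
vectorized = sigmoid(z)

# ... gives the same values as calling sigmoid on each entry in turn.
looped = np.array([sigmoid(v) for v in z])

print(vectorized)
print(np.allclose(vectorized, looped))
```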

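We then add a `feedforward` method to the `Network` class, which, given an input `a` for the network, returns the corresponding output. Note that this code stores the biases separately from the weights and multiplies the weight matrix from the left, so each layer computes $\sigma(wa + b)$ rather than the augmented row-vector product used in the equations above: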

In [None]:
def feedforward(self, a):
    """Return the output of the network when `a` is the input."""
    # The output a of each layer becomes the input to the next layer.
    for b, w in zip(self.biases, self.weights):
        a = sigmoid(np.dot(w, a)+b)
    return a
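To run the method, a `Network` class with `biases` and `weights` attributes is needed. Its definition is not shown in this chapter, so the constructor below is an assumption: it takes a list of layer sizes and draws random weights and biases, and the `seed` argument is added here only to make the example reproducible.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation, applied elementwise to NumPy arrays."""
    return 1.0 / (1.0 + np.exp(-z))

class Network:
    def __init__(self, sizes, seed=0):
        # Assumed constructor: sizes lists the neurons per layer, e.g. [3, 2, 1].
        rng = np.random.default_rng(seed)
        # One bias column vector per non-input layer.
        self.biases = [rng.standard_normal((n, 1)) for n in sizes[1:]]
        # One (n_out, n_in) weight matrix per pair of adjacent layers.
        self.weights = [rng.standard_normal((n_out, n_in))
                        for n_in, n_out in zip(sizes[:-1], sizes[1:])]

    def feedforward(self, a):
        """Return the output of the network when `a` is the input."""
        for b, w in zip(self.biases, self.weights):
            a = sigmoid(np.dot(w, a) + b)
        return a

# Three inputs, two hidden neurons, one output, as in Figure 8.1.
net = Network([3, 2, 1])
x = np.array([[0.5], [-1.0], [2.0]])   # column vector of shape (3, 1)
y_hat = net.feedforward(x)
print(y_hat.shape)   # (1, 1)
```

The input is passed as a column vector of shape `(3, 1)` so that it lines up with the `(n, 1)` bias vectors at every layer.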